Greedy RankRLS: a Linear Time Algorithm for Learning Sparse Ranking Models

Authors

  • Tapio Pahikkala
  • Antti Airola
  • Pekka Naula
  • Tapio Salakoski
Abstract

Ranking is a central problem in information retrieval. Much work has been done in recent years to automate the development of ranking models by means of supervised machine learning. Feature selection aims to provide sparse models that are computationally efficient to evaluate and have good ranking performance. We propose integrating feature selection into the training process of the ranking algorithm by means of a wrapper method that performs greedy forward selection, using a leave-query-out cross-validation estimate of performance as the selection criterion. We introduce a linear time training algorithm, called greedy RankRLS, which combines this procedure with regularized risk minimization based on a pairwise least-squares loss. The training complexity of the method is O(kmn), where k is the number of features to be selected, m is the number of training examples, and n is the overall number of features. Experiments on the LETOR benchmark data set demonstrate that the approach works in practice.

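To make the wrapper procedure described in the abstract concrete, here is a minimal, self-contained Python sketch of greedy forward selection driven by a leave-query-out cross-validation estimate of pairwise ranking accuracy. It is not the authors' O(kmn) greedy RankRLS algorithm: each candidate model is refit from scratch, and a plain regularized least-squares regression stands in for the pairwise least-squares (RankRLS) learner. All names (`fit_rls`, `leave_query_out_score`, `lam`) are illustrative assumptions, not part of the paper.

```python
import numpy as np

def fit_rls(X, y, lam=1.0):
    """Regularized least-squares fit: w = (X^T X + lam*I)^(-1) X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def pairwise_accuracy(scores, y):
    """Fraction of correctly ordered pairs among pairs with differing labels."""
    correct, total = 0, 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue
            total += 1
            if (scores[i] - scores[j]) * (y[i] - y[j]) > 0:
                correct += 1
    return correct / total if total else 0.0

def leave_query_out_score(X, y, queries, feats, lam=1.0):
    """Leave-query-out CV: average pairwise accuracy over held-out queries."""
    fold_scores = []
    for q in np.unique(queries):
        test = queries == q
        train = ~test
        w = fit_rls(X[train][:, feats], y[train], lam)
        fold_scores.append(pairwise_accuracy(X[test][:, feats] @ w, y[test]))
    return float(np.mean(fold_scores))

def greedy_forward_selection(X, y, queries, k, lam=1.0):
    """Greedy wrapper: repeatedly add the feature whose inclusion gives the
    best leave-query-out cross-validation estimate of ranking performance."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best_val, best_f = max(
            (leave_query_out_score(X, y, queries, selected + [f], lam), f)
            for f in remaining)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```

Under these assumptions, a call such as `greedy_forward_selection(X, y, queries, k=5)` returns the indices of the five selected features; the sparse model is then the least-squares fit restricted to those columns.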

Similar articles

KEYNOTE: The DDI Approach to Features: Don't Do It

A Sparse Regularized Least-Squares Preference Learning Algorithm

Learning preferences between objects constitutes a challenging task that notably differs from standard classification or regression problems. The objective is to predict an ordering of the data points. Furthermore, methods for learning preference relations are usually computationally more demanding than standard classification or regression methods. Recently, we have proposed a kernel bas...

Learning Preferences with Co-Regularized Least-Squares

Situations in which only a limited amount of labeled data and a large amount of unlabeled data are available to the learning algorithm are typical of many real-world problems. In this paper, we propose a semi-supervised preference learning algorithm that is based on the multi-view approach. Multi-view learning algorithms operate by constructing a predictor for each view and by choosing such predicti...

Large scale training methods for linear RankRLS

RankRLS is a recently proposed state-of-the-art method for learning ranking functions by minimizing a pairwise ranking error. The method can be trained by solving a system of linear equations. In this work, we investigate the use of conjugate gradient and regularization by iteration for linear RankRLS training on very large, high-dimensional, but sparse data sets. Such data is typically enco...


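The entry above describes training linear RankRLS by solving a system of linear equations with conjugate gradient, with early stopping of the iteration acting as regularization. Below is a minimal matrix-free sketch of that idea, assuming the common formulation in which the pairwise least-squares optimum solves (X^T L X + lam*I) w = X^T L y, where L is the block-diagonal Laplacian of the all-pairs preference graph within each query. It uses SciPy's `cg` solver and is an illustration, not the paper's implementation; capping `maxiter` corresponds loosely to regularization by iteration.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def train_linear_rankrls_cg(X, y, queries, lam=1.0, maxiter=100):
    """Solve (X^T L X + lam*I) w = X^T L y with conjugate gradient, where L is
    the block-diagonal Laplacian of the all-pairs graph within each query.
    L is never formed explicitly; only matrix-vector products are used."""
    m, d = X.shape
    groups = [np.flatnonzero(queries == q) for q in np.unique(queries)]

    def apply_laplacian(v):
        # For a query block of size n_q: (L v)_i = n_q * v_i - sum(v over block)
        out = np.empty_like(v, dtype=float)
        for idx in groups:
            block = v[idx]
            out[idx] = len(idx) * block - block.sum()
        return out

    def matvec(w):
        return X.T @ apply_laplacian(X @ w) + lam * w

    A = LinearOperator((d, d), matvec=matvec, dtype=float)
    b = X.T @ apply_laplacian(y.astype(float))
    w, info = cg(A, b, maxiter=maxiter)
    return w
```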
Exact and Efficient Leave-Pair-Out Cross-Validation for Ranking RLS

In this paper, we introduce an efficient cross-validation algorithm for RankRLS, a kernel-based ranking algorithm. Cross-validation (CV) is one of the most useful methods for model selection and performance assessment of machine learning algorithms, especially when the amount of labeled data is small. A natural way to measure the performance of ranking algorithms by CV is to hold each data poin...


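The last entry concerns leave-pair-out cross-validation. The naive version of that estimate, retraining once for every held-out pair of differently labeled examples, is easy to write down; the paper's contribution is computing the same quantity exactly without retraining, which this self-contained sketch does not attempt. A plain regularized least-squares fit stands in for the learner, and all names are illustrative.

```python
import numpy as np
from itertools import combinations

def fit_rls(X, y, lam=1.0):
    """Regularized least-squares fit (a stand-in for the RankRLS learner)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def leave_pair_out_cv(X, y, lam=1.0):
    """Naive leave-pair-out CV: for every pair with differing labels, retrain
    without that pair and check whether the pair is ranked in the right order."""
    m = X.shape[0]
    correct, total = 0, 0
    for i, j in combinations(range(m), 2):
        if y[i] == y[j]:
            continue
        keep = np.ones(m, dtype=bool)
        keep[[i, j]] = False
        w = fit_rls(X[keep], y[keep], lam)
        total += 1
        if ((X[i] - X[j]) @ w) * (y[i] - y[j]) > 0:
            correct += 1
    return correct / total if total else 0.0
```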

Journal:

Volume:   Issue:

Pages:  -

Publication year: 2010